Data-Blocks and Header-Word Format

Pointers to data-blocks have the following format:

----------------------------------------------------------------
|      Dual-word address of data-block (29 bits)       | 1 1 1 |
----------------------------------------------------------------
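
The pointer layout above can be sketched in Python as follows. This is only an illustration, not the system's own code: the function names are hypothetical, and the sketch assumes 32-bit words (29 address bits plus 3 tag bits), so one dual word is taken to be 8 bytes.

```python
LOWTAG_BITS = 3
LOWTAG_MASK = (1 << LOWTAG_BITS) - 1   # 0b111
DATA_BLOCK_POINTER_TAG = 0b111         # low three bits shown in the diagram
WORD_MASK = 0xFFFFFFFF                 # assumption: 32-bit words


def make_data_block_pointer(dual_word_address):
    """Pack a 29-bit dual-word address and the 111 tag into one word."""
    if not 0 <= dual_word_address < (1 << 29):
        raise ValueError("dual-word address must fit in 29 bits")
    return (dual_word_address << LOWTAG_BITS) | DATA_BLOCK_POINTER_TAG


def is_data_block_pointer(word):
    """True when the low three bits carry the data-block pointer tag."""
    return (word & LOWTAG_MASK) == DATA_BLOCK_POINTER_TAG


def dual_word_address(word):
    """Recover the 29-bit dual-word address by dropping the tag bits."""
    return (word & WORD_MASK) >> LOWTAG_BITS


def byte_address(word):
    """Byte address of the header-word, assuming 8-byte dual words."""
    return dual_word_address(word) * 8


ptr = make_data_block_pointer(0x1000)
```

Because data-blocks start on dual-word boundaries, the three low bits of their byte address are always zero, which is what leaves room for the tag.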

The word pointed to by the above descriptor is a header-word, and it has the same format as an other-immediate:

----------------------------------------------------------------
|   Data (24 bits)        | Type (8 bits with low-tag) | 0 1 0 |
----------------------------------------------------------------

This is convenient for scanning the heap when GC'ing, but it does mean that whenever GC encounters an other-immediate word, it has to do a range check on the low byte to see whether it is a header-word or just a character (for example). This is an easily acceptable performance hit for scanning.
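
The range check described above might look like the following Python sketch. The field positions come from the diagram, but the type codes are NOT taken from the system: HEADER_TYPE_MIN, HEADER_TYPE_MAX, and CHARACTER_TYPE are made-up values chosen only so that their low three bits are 010, and the assumption that header types form one contiguous range is also just for illustration.

```python
OTHER_IMMEDIATE_LOWTAG = 0b010

# Hypothetical type codes (not the real ones). Each has low bits 010.
HEADER_TYPE_MIN = 0x0A
HEADER_TYPE_MAX = 0x7A
CHARACTER_TYPE = 0x82


def is_other_immediate(word):
    """True when the low three bits are the other-immediate tag."""
    return (word & 0b111) == OTHER_IMMEDIATE_LOWTAG


def type_byte(word):
    """Low 8 bits: the type code, including the low-tag."""
    return word & 0xFF


def data_field(word):
    """Upper 24 bits of a 32-bit other-immediate or header-word."""
    return (word >> 8) & 0xFFFFFF


def is_header_word(word):
    """Range check on the low byte, as GC would do while scanning."""
    return (is_other_immediate(word)
            and HEADER_TYPE_MIN <= type_byte(word) <= HEADER_TYPE_MAX)


header_word = (2 << 8) | HEADER_TYPE_MIN        # data field of 2
character_word = (ord("A") << 8) | CHARACTER_TYPE
```

Both example words pass is_other_immediate, but only header_word falls inside the assumed header range, so a scanner would treat it as the start of a data-block and treat character_word as an ordinary immediate value.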

The system interprets the data portion of the header-word for non-vector data-blocks as the word length excluding the header-word. For example, the data field of the header for ratio and complex numbers is two, one word each for the numerator and denominator or for the real and imaginary parts.
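
As a rough sketch of how a heap scanner could use this length, the Python below computes the size of a non-vector data-block from its header-word. RATIO_TYPE is a hypothetical type code, and the rounding in aligned_words is an inference from the dual-word addressing of data-block pointers rather than something stated here.

```python
RATIO_TYPE = 0x0A  # hypothetical type code, low three bits 010


def data_field(header):
    """Upper 24 bits of the header-word."""
    return (header >> 8) & 0xFFFFFF


def payload_words(header):
    """Words following the header-word of a non-vector data-block."""
    return data_field(header)


def total_words(header):
    """Payload plus the header-word itself."""
    return 1 + payload_words(header)


def aligned_words(header):
    """Total rounded up to an even word count (assumed dual-word padding)."""
    n = total_words(header)
    return n + (n & 1)


# A ratio: one word each for the numerator and the denominator.
ratio_header = (2 << 8) | RATIO_TYPE
```

For the ratio header, payload_words gives 2 and total_words gives 3. Under the padding assumption, the next data-block would begin 4 words after this header-word.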

For vectors and data-blocks representing Lisp objects stored like vectors, the system ignores the data portion of the header-word:

----------------------------------------------------------------
| Unused Data (24 bits)   | Type (8 bits with low-tag) | 0 1 0 |
----------------------------------------------------------------
|           Element Length of Vector (30 bits)           | 0 0 |
----------------------------------------------------------------

Using a separate word allows for much larger vectors, and it allows length to simply access a single word without masking or shifting. Similarly, the header for complex arrays and vectors has a second word, following the header-word, that the system uses for the fill pointer, so computing the length of any array uses the same code sequence.
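
The uniform length access can be illustrated with the Python sketch below, which models a data-block as a list of words. VECTOR_TYPE and COMPLEX_VECTOR_TYPE are hypothetical type codes, and the element words are placeholders. The shift in fixnum_value exists only because Python sees the raw word; inside the Lisp system the word with low bits 00 already is the fixnum, which is why no masking or shifting is needed there.

```python
VECTOR_TYPE = 0x22          # hypothetical type code, low three bits 010
COMPLEX_VECTOR_TYPE = 0x2A  # hypothetical type code, low three bits 010


def make_fixnum(n):
    """Raw word for a fixnum: the value above two zero tag bits."""
    return n << 2


def fixnum_value(word):
    """Decode a raw fixnum word for inspection from outside Lisp."""
    if word & 0b11:
        raise ValueError("not a fixnum: low two bits must be 00")
    return word >> 2


def vector_length(block):
    """Read the word after the header-word; the header data is ignored."""
    return fixnum_value(block[1])


# Simple vector: header-word, element count, then five element words.
simple_vector = [VECTOR_TYPE, make_fixnum(5)] + [make_fixnum(i) for i in range(5)]

# Complex vector: the second word holds the fill pointer (3 here).
# The remaining words are placeholders for whatever else the header holds.
complex_vector = [COMPLEX_VECTOR_TYPE, make_fixnum(3), 0, 0]
```

The same vector_length call works on both blocks, returning the element count for the simple vector and the fill pointer for the complex one.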